Mining of High Dimensional Data using Efficient Feature Subset Selection Clustering Algorithm (WEKA)

Authors

  • Lakshmi Sarika
  • Satyanarayana Reddy
  • YongSeog Kim
  • Marcos Evandro Cintra
  • Trevor P. Martin
  • Maria Carolina Monard
Abstract

We introduced the concept of data mining through the free and open source software Waikato Environment for Knowledge Analysis (WEKA), which allows you to mine your own data for trends and patterns. We also described the first technique for data mining, regression, which allows you to predict a numerical value for a given set of input values. This method of analysis is the easiest to perform and the least powerful method of data mining, but it served a good purpose as an introduction to WEKA and provided a good example of how raw data can be transformed into meaningful information. We will take you through two additional data mining techniques that are slightly more complex than a regression model, but more powerful in their respective goals. Where a regression model could only give you a numerical output for specific inputs, these additional models allow...

Similar resources

Feature Selection for Small Sample Sets with High Dimensional Data Using Heuristic Hybrid Approach

Feature selection can be decisive when analyzing high dimensional data, especially with a small number of samples. Feature extraction methods do not perform well under these conditions. With small sample sets and high dimensional data, exploring a large search space and learning from insufficient samples becomes extremely hard. As a result, neural networks and clustering a...

Online Streaming Feature Selection Using Geometric Series of the Adjacency Matrix of Features

Feature Selection (FS) is an important pre-processing step in machine learning and data mining. All the traditional feature selection methods assume that the entire feature space is available from the beginning. However, online streaming features (OSF) are an integral part of many real-world applications. In OSF, the number of training examples is fixed while the number of features grows with t...

A Framework for Mining High Dimensional Data for Feature Subset Selection

Features are representative characteristics of data sets. Identifying such features in a high dimensional dataset plays an important role in real world applications. Data mining is best used to determine important features. Selecting important features from a subject of identified features can help in making expert decisions. However, efficient identification of such feature subset and selection ...

Determining Optimal Support Vector Machines for Hyperspectral Image Classification Based on a Genetic Algorithm

Hyperspectral remote sensing imagery, owing to its rich spectral information, provides an efficient tool for ground classification in complex geographical areas with similar classes. Given the robustness of Support Vector Machines (SVMs) in high dimensional spaces, they are an efficient tool for classifying hyperspectral imagery. However, there are two optimization issues which s...

Comparative Analysis of Data Mining Tools and Classification Techniques using WEKA in Medical Bioinformatics

The availability of huge amounts of data has created a great need for data mining techniques to generate useful knowledge. In the present study we provide detailed information about data mining techniques, with a focus on classification as an important supervised learning technique. We also discuss WEKA software as a tool of choice to perform classification analysis for diffe...

Publication date: 2014